Abstract: Distributed Space-Time Block Codes (DSTBCs)
from Complex Orthogonal Designs (CODs) (both square and
non-square CODs other than the Alamouti design) are known to
lose their single-symbol ML decodable (SSD) property when used
in two-hop wireless relay networks using the amplify-and-forward
protocol. For such a network, this paper constructs a new class
of high-rate, training-embedded (TE) SSD DSTBCs from
TE-CODs. The proposed codes include the training symbols in
the structure of the code, which is shown to be the key to
obtaining a high rate as well as the SSD property. TE-CODs are
shown to offer full diversity for arbitrary complex constellations.
Non-square TE-CODs are shown to provide higher rates (in symbols
per channel use) than the known SSD DSTBCs for relay
networks with fewer than 10 relays.

I. Introduction and Preliminaries 

Distributed space-time coding has been a powerful tech- 
nique for achieving spatial diversity in wireless networks 
with single antenna terminals. An excellent introduction to 
cooperative communications based on distributed space-time 
coding in two-hop wireless networks can be seen in [1], 
[2] and the references within. The technique involves a two 
phase protocol where, in the first phase, the source broadcasts 
the information to the relays and in the second phase, the 
relays linearly process the signals received from the source 
and forward them to the destination such that the signal at 
the destination appears as a Space-Time Block Code (STBC). 
Such STBCs, generated distributively by the relay nodes, are 
called Distributed Space-Time Block Codes (DSTBCs). 

In a co-located Multiple Input Multiple Output (MIMO) 
channel, an STBC is said to be Single-Symbol Maximum 
Likelihood (ML) Decodable (SSD) if the ML decoding metric 
splits as a sum of several terms, with each term being a 
function of only one of the information symbols [3]. Since 
the work of [1], [2], considerable efforts have been made to 
design SSD DSTBCs. A DSTBC is said to be SSD if the 
STBC seen by the destination from the set of relays is SSD. 

DSTBCs with single-symbol ML decodability were first
introduced for cooperative networks in [4]. Further, in [5],
high-rate SSD DSTBCs have been proposed wherein the
source performs linear precoding of the information symbols
before transmitting them to all the relays. For the class of codes
proposed in [4] and [5], the channel model is such that each
relay is assumed to know only the statistics of the channel
from the source to itself (but not its realization). In [6] and
[7], SSD DSTBCs are proposed for the case where every relay
node is assumed to have perfect knowledge of the phase
component of the channel from the source to the relay. An
upper bound on the symbol rate for such a setup is shown
to be 1/2 (in complex symbols per channel use in the second
phase), which is independent of the number of relays. However,
these codes have exponential decoding delay whereas the
codes in [4] and [5] are of minimal delay. Moreover, in the
model considered in [6] and [7], training sequences have to
be transmitted from the source to the relays since each relay
needs to know the phase component of the channel from the
source to itself. Therefore, the source needs to spend some of
its resources, such as power and bandwidth, on transmitting
the training sequences. In [6] and [7], the channel uses spent
on transmitting training signals are not accounted for
in computing the rate of the DSTBCs.

For point to point co-located MIMO channels, complex 
orthogonal designs (CODs) [8], [9], coordinate interleaved 
orthogonal designs (CIODs) [3] and Clifford unitary weight 
designs (CUWDs) [10] are well known for their SSD property 
when used to generate STBCs. Note that, with the assumption 
of the knowledge of the phase component of the source-relay 
channel at the relays, all CODs can be constructed as DSTBCs 
[11]. The extensions of CODs such as CIODs and CUWDs 
can also be distributively constructed. However, when constructed
distributively, CODs (other than the Alamouti design), CIODs and
CUWDs (other than that for 4 antennas) do not retain the SSD property.

In this paper, we propose high-rate, training-embedded SSD
DSTBCs. The proposed codes include the training symbols
in the structure of the code, which is shown to be the key
to obtaining a high rate as well as the SSD property. Along
the lines of the work in [6], [7], each relay node is
assumed to know the phase component of the
channel from the source to itself. In this paper, the
channel uses spent on transmitting training signals from the
source to the relays are accounted for in computing the rate of
the proposed DSTBCs. The main contributions of this paper
and its organization can be summarized as follows:

• We propose a novel method to construct high-rate (in
symbols per channel use) SSD DSTBCs for two-hop
wireless relay networks based on the amplify-and-forward
protocol. The proposed method has an in-built training
scheme for the relays to learn the phase components of
their backward channels. The in-built training symbols are
shown to be the key to obtaining a high rate as well as
the SSD property (Section II).

• When all the zero entries of a COD (square or non-square)
are replaced by a non-zero constant, the resulting design
is called a Training-Embedded COD (TE-COD). These
are shown to generate SSD DSTBCs. This essentially
enables all CODs to be usable as SSD DSTBCs with full
diversity for arbitrary complex constellations. The class
of non-square TE-CODs is shown to provide higher
rates than the existing SSD codes of [7] for two-hop
networks with fewer than 10 relays, even though the
channel uses spent in sending the training symbols are not
included in calculating the rate of the schemes in [7]
(Section III).

• Simulation results for 4 relays are presented which show
that the proposed scheme performs better than the code
presented in [7] by 0.5 dB (Section IV).

Notations: Throughout the paper, lower case boldface letters
and capital boldface letters are used to represent vectors and
matrices respectively. For a complex matrix $\mathbf{X}$, the matrices
$\mathbf{X}^*$, $\mathbf{X}^T$, $\mathbf{X}^H$, $|\mathbf{X}|$, $\mathrm{Re}\,\mathbf{X}$ and $\mathrm{Im}\,\mathbf{X}$ denote, respectively, the
conjugate, transpose, conjugate transpose, determinant, real
part and imaginary part of $\mathbf{X}$. The element in the $r_1$-th row and
the $r_2$-th column of the matrix $\mathbf{X}$ is denoted by $[\mathbf{X}]_{r_1, r_2}$. The
$T \times T$ identity matrix and the $T \times T$ zero matrix are respectively
denoted by $\mathbf{I}_T$ and $\mathbf{0}_T$. The magnitude of a complex number $x$
is denoted by $|x|$ and $E[x]$ is used to denote the expectation
of the random variable $x$. A circularly symmetric complex
Gaussian random vector $\mathbf{x}$ with mean $\mu$ and covariance matrix
$\Gamma$ is denoted by $\mathbf{x} \sim \mathcal{CSCG}(\mu, \Gamma)$. The sets of all integers,
real numbers and complex numbers are respectively
denoted by $\mathbb{Z}$, $\mathbb{R}$ and $\mathbb{C}$, and $i$ is used to represent $\sqrt{-1}$.

II. Training-Embedded Precoded Distributed
Space-Time Coding 

A. Signal Model 

The wireless network considered, shown in Fig. 1, consists
of $K + 2$ nodes, each having a single antenna. There is one
source node and one destination node. All the other $K$ nodes
are relays. We denote the channel from the source node to the
$\lambda$-th relay as $h_\lambda$ and the channel from the $\lambda$-th relay to the
destination node as $g_\lambda$ for $\lambda = 1, 2, \cdots, K$. The following
assumptions are made in our model:

• All the nodes are half-duplex constrained.

• Fading coefficients $h_\lambda$ and $g_\lambda$ are i.i.d. $\mathcal{CSCG}(0, 1)$ with a
coherence time interval of at least $N$ and $T$ channel uses
respectively, where $N$ and $T$ are the number of channel
uses in the first phase and the second phase, respectively.

• All the nodes are synchronized at the symbol level.

• Relay nodes have the knowledge of only the phase
components of the fade coefficients $h_\lambda$.

• Destination knows all the fade coefficients $g_\lambda$, $h_\lambda$ for
$\lambda = 1, 2, \cdots, K$.

Fig. 1. Wireless relay network model.

The source is equipped with a codebook $\mathcal{S} =
\{\mathbf{x}_1, \mathbf{x}_2, \mathbf{x}_3, \cdots, \mathbf{x}_L\}$ consisting of information vectors
$\mathbf{x}_l \in \mathbb{C}^{N \times 1}$ such that $E[\mathbf{x}_l^H \mathbf{x}_l] = 1$. The information vectors
are of the form

$$\mathbf{x} = [\,\underbrace{a \;\; a \;\; \cdots \;\; a}_{\lceil (T-k)/2 \rceil \text{ times}} \;\; x_1 \;\; x_2 \;\; \cdots \;\; x_k\,]^T \in \mathbb{C}^{N \times 1}$$

where the complex variables $x_1, x_2, \cdots, x_k$ take values from
a complex signal set denoted by $\mathcal{M}$, $a \in \mathbb{C}$ is a non-zero
complex constant chosen as the training symbol and $N =
\lceil (T-k)/2 \rceil + k$. The value of $a$ is chosen such that the condition
$E[\mathbf{x}_l^H \mathbf{x}_l] = 1$ is satisfied. The value of $a$ is assumed to be
known to all the relays and the destination. In the first phase,
the source broadcasts the vector $\mathbf{x}$ to all the $K$ relays (but not
to the destination, which is assumed to be located far from the
source).
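As a minimal sketch of how such an information vector could be assembled (not the authors' implementation), the Python snippet below places a given number of training slots ahead of the $k$ information symbols and picks a real, positive $a$ so that the total energy equals 1. The function names, the parameter `n_train`, the restriction to a real $a$ and the example symbol scaling are assumptions made only for illustration.

```python
import math


def training_symbol(n_train, k, symbol_energy):
    """Real, positive a with n_train*|a|^2 + k*symbol_energy == 1 (assumed form)."""
    remaining = 1.0 - k * symbol_energy
    if n_train <= 0 or remaining <= 0:
        raise ValueError("no energy left for the training symbol")
    return math.sqrt(remaining / n_train)


def build_info_vector(symbols, n_train, symbol_energy):
    """Training slots first, then the k information symbols."""
    a = training_symbol(n_train, len(symbols), symbol_energy)
    return [complex(a)] * n_train + list(symbols)


# Hypothetical example: k = 3 symbols of energy 0.25 each, one training slot.
s = math.sqrt(0.125)
symbols = [s * (1 + 1j), s * (-1 + 1j), s * (1 - 1j)]
x = build_info_vector(symbols, n_train=1, symbol_energy=0.25)
total_energy = sum(abs(v) ** 2 for v in x)
```

With these example numbers the training symbol comes out as 0.5 and the vector has length 4, matching the dimensions of the $4 \times 4$ example discussed later in the paper.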

The received vector at the $\lambda$-th relay is given by $\mathbf{r}_\lambda =
\sqrt{P_1 N}\, h_\lambda \mathbf{x} + \mathbf{n}_\lambda \in \mathbb{C}^{N \times 1}$, for all $\lambda = 1, 2, \cdots, K$, where
$\mathbf{n}_\lambda \sim \mathcal{CSCG}(\mathbf{0}_{N \times 1}, \mathbf{I}_N)$ is the additive noise at the $\lambda$-th relay
and $P_1$ is the total power used at the source node for every
channel use. Using the $N = \lceil (T-k)/2 \rceil + k$ length vector $\mathbf{r}_\lambda$,
the $\lambda$-th relay constructs the $T$-length new vector $\tilde{\mathbf{r}}_\lambda$ given
by (1), where $\mathbf{r}_\lambda(i)$ denotes the
$i$-th component of the vector $\mathbf{r}_\lambda$. The $\lambda$-th relay is assumed to
obtain a perfect estimate of the phase component of $h_\lambda$ using
the training symbols sent during the first $\lceil (T-k)/2 \rceil$ channel uses
in the first phase. This enables the phase compensation in
(1), which can also be written as

$$\tilde{\mathbf{r}}_\lambda = \sqrt{P_1 N}\, |h_\lambda|\, \tilde{\mathbf{x}} + \tilde{\mathbf{n}}_\lambda \in \mathbb{C}^{T \times 1}$$

where

$$\tilde{\mathbf{x}} = [\,\underbrace{a \;\; a \;\; \cdots \;\; a}_{T-k \text{ times}} \;\; x_1 \;\; x_2 \;\; \cdots \;\; x_k\,]^T \in \mathbb{C}^{T \times 1}. \quad (5)$$

Note that the concatenating operation in (1) continues to
keep the components of $\tilde{\mathbf{n}}_\lambda$ identically distributed and
uncorrelated to each other.
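The phase-compensation step can be illustrated with a short Python sketch. It only covers the reception and the rotation by the channel phase; the concatenation that extends the vector to length $T$ is not reproduced because its exact form is not available here. The helper names, the seed and the example values of the power and the transmitted vector are hypothetical.

```python
import cmath
import math
import random


def cscg(rng):
    """One CSCG(0, 1) sample: independent real and imaginary parts of variance 1/2."""
    s = math.sqrt(0.5)
    return complex(rng.gauss(0.0, s), rng.gauss(0.0, s))


def relay_receive(x, h, p1, rng, noise=True):
    """r = sqrt(P1 * N) * h * x + n, with N = len(x)."""
    gain = math.sqrt(p1 * len(x))
    return [gain * h * xi + (cscg(rng) if noise else 0) for xi in x]


def phase_compensate(r, h):
    """Rotate by exp(-i * arg(h)); only the phase of h is used, not its magnitude."""
    rot = cmath.exp(-1j * cmath.phase(h))
    return [rot * ri for ri in r]


rng = random.Random(0)
h = cscg(rng)
x = [0.5, 0.25 + 0.25j, -0.25 + 0.25j, 0.25 - 0.25j]
r = relay_receive(x, h, p1=2.0, rng=rng, noise=False)   # noise-free for clarity
r_comp = phase_compensate(r, h)
```

In the noise-free case each compensated sample equals $\sqrt{P_1 N}\,|h_\lambda|$ times the transmitted entry, which is the structure of the compensated model stated in the text.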

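In the second phase, all the relay nodes are scheduled to
transmit $T$-length vectors to the destination simultaneously.
Each relay is equipped with a fixed pair of matrices $\mathbf{A}_\lambda$,
$\mathbf{B}_\lambda \in \mathbb{C}^{T \times T}$ and is allowed to linearly process the vector
$\tilde{\mathbf{r}}_\lambda$. The $\lambda$-th relay is scheduled to transmit a vector
$\mathbf{t}_\lambda$, a power-normalized version of $\mathbf{A}_\lambda \tilde{\mathbf{r}}_\lambda + \mathbf{B}_\lambda \tilde{\mathbf{r}}_\lambda^*$,
where $P_2$ is the total power used at each relay for every
channel use in the second phase and $P_r$ is the average norm of
the vector $\tilde{\mathbf{r}}_\lambda$. The vector received at the destination is given
by

$$\mathbf{y} = \sum_{\lambda=1}^{K} g_\lambda \mathbf{t}_\lambda + \mathbf{w}$$

where $\mathbf{w} \sim \mathcal{CSCG}(\mathbf{0}_{T \times 1}, \mathbf{I}_T)$ is the additive noise at the
destination. Substituting for $\mathbf{t}_\lambda$, $\mathbf{y}$ can be written in terms of
a codeword matrix $\mathbf{X}$, an equivalent channel $\mathbf{g}$ and an equivalent
noise vector $\mathbf{n}$, where

• The noise vector $\mathbf{n}$ collects the relay noise forwarded to the
destination and the noise $\mathbf{w}$.

• The equivalent channel is $\mathbf{g} = [\,|h_1| g_1 \;\; |h_2| g_2 \;\; \cdots \;\; |h_K| g_K\,]^T \in \mathbb{C}^{K \times 1}$.

• Every codeword $\mathbf{X} \in \mathbb{C}^{T \times K}$, which is of the form (2),
is a function of the information vector $\mathbf{x}$ through $\tilde{\mathbf{x}}$.

The covariance matrix $\mathbf{R} \in \mathbb{C}^{T \times T}$ of the noise vector $\mathbf{n}$ is
given in (3). Note that $\mathbf{R}$ depends on the
choice of the relay matrices $\mathbf{A}_\lambda$ and $\mathbf{B}_\lambda$. The relay matrices
need to be chosen such that the resulting code seen by the
destination is SSD.

The Maximum Likelihood (ML) decoder for $\mathbf{x}$ is given by
$\hat{\mathbf{x}}$ shown in (4).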

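Definition 1: The collection $\mathcal{C}$ of $T \times K$ codeword matrices
given by (2),

$$\mathcal{C} = \{\mathbf{X} \mid \forall\, \mathbf{x} \in \mathcal{S}\} \quad (7)$$

is called a Training-Embedded Distributed Space-Time Block
Code (TE-DSTBC), which is determined by the sets $\{\mathbf{A}_\lambda, \mathbf{B}_\lambda\}$
and $\mathcal{S}$.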

Note that unlike the existing DSTBCs, TE-DSTBCs contain 
the training symbols in the code structure along with the infor- 
mation symbols justifying their name. In the following section, 
we show that this training-embedding enables construction of 
SSD TE-DSTBCs. 

III. TE-DSTBCs from TE-CODs

In this section, we construct two classes of TE-DSTBCs
(square and non-square TE-DSTBCs) that are single-symbol
ML decodable at the destination. The proposed designs are
derived from the well known class of complex orthogonal
designs (CODs) [8]. The proposed class of complex designs
is introduced in the following definition.

Definition 2: Let the $T \times K$ matrix $\mathbf{X}$ represent a COD in $k$
complex variables. If the zeros in the design $\mathbf{X}$ are replaced by
a non-zero constant, say $a \in \mathbb{C}$, then we refer to the resulting
design as a TE-COD.

Note that the above definition holds both for the classes of
square CODs (when $T = K$) as well as non-square CODs
(when $T > K$).
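To make Definition 2 concrete, the following Python sketch encodes one standard arrangement of the $4 \times 4$ rate-3/4 COD and fills its structural zeros with a constant. The symbolic encoding, the function names and the sample values are assumptions for illustration, and the arrangement of entries is not necessarily the one used by the authors.

```python
# Hypothetical encoding: (sign, variable index, conjugated?) or None for a zero.
COD_4X4 = [
    [(+1, 0, False), (+1, 1, False), (+1, 2, False), None],
    [(-1, 1, True), (+1, 0, True), None, (+1, 2, False)],
    [(-1, 2, True), None, (+1, 0, True), (-1, 1, False)],
    [None, (-1, 2, True), (+1, 1, True), (+1, 0, False)],
]


def evaluate(design, xs, a=0):
    """Return the numeric matrix; structural zeros become `a` (a=0: plain COD)."""
    def entry(e):
        if e is None:
            return complex(a)
        sign, idx, conj = e
        return sign * (xs[idx].conjugate() if conj else xs[idx])
    return [[entry(e) for e in row] for row in design]


def gram(m):
    """Compute X^H X for a list-of-lists matrix X."""
    rows, cols = len(m), len(m[0])
    return [[sum(m[t][i].conjugate() * m[t][j] for t in range(rows))
             for j in range(cols)] for i in range(cols)]


xs = [1 + 2j, -0.5 + 1j, 3 - 1j]
cod = evaluate(COD_4X4, xs)             # ordinary COD, zeros kept
te_cod = evaluate(COD_4X4, xs, a=0.5)   # TE-COD, zeros replaced by a
energy = sum(abs(x) ** 2 for x in xs)
```

For the plain COD, `gram(cod)` is a scaled identity with `energy` on the diagonal, which confirms that the encoded design is orthogonal before the zeros are replaced.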

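Example 1: For the well known $4 \times 4$ COD [8] of rate
3/4, with $x_1$, $x_2$ and $x_3$ being the complex variables, the
corresponding TE-COD is given by

$\mathbf{X}_{\text{TE-COD}}$: $4 \times 4$ design with entries $a$, $\pm x_i$ and $\pm x_i^*$ (matrix entries not recoverable from the source). (8)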



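In general, given a $T \times K$ TE-COD $\mathbf{X}_{\text{TE-COD}}$ in $k$ variables,
every column of $\mathbf{X}_{\text{TE-COD}}$ contains exactly $k$ distinct variables and
$T - k$ copies of $a$. Since $\mathbf{X}_{\text{TE-COD}}$ is a linear design [12] in the
constant $a$ and the variables $x_i$, the design can also be
written as

$$\mathbf{X}_{\text{TE-COD}} = [\,\mathbf{C}_1 \tilde{\mathbf{x}} + \mathbf{D}_1 \tilde{\mathbf{x}}^* \;\; \cdots \;\; \mathbf{C}_K \tilde{\mathbf{x}} + \mathbf{D}_K \tilde{\mathbf{x}}^*\,] \in \mathbb{C}^{T \times K} \quad (9)$$

where

$$\tilde{\mathbf{x}} = [\,\underbrace{a \;\; a \;\; \cdots \;\; a}_{T-k \text{ times}} \;\; x_1 \;\; x_2 \;\; \cdots \;\; x_k\,]^T \in \mathbb{C}^{T \times 1} \quad (10)$$

and $\mathbf{C}_\lambda, \mathbf{D}_\lambda \in \mathbb{C}^{T \times T}$, $\lambda = 1, 2, \cdots, K$, are the column-vector
representation matrices of $\mathbf{X}_{\text{TE-COD}}$ [9]. The number of $a$'s
in the vector $\tilde{\mathbf{x}}$ is equal to the number of $a$'s in every column
of the TE-COD. The following theorem provides an important
relation satisfied by the matrices $\mathbf{C}_\lambda, \mathbf{D}_\lambda$ of TE-CODs.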

Theorem 1: The column-vector representation matrices
$\mathbf{C}_\lambda, \mathbf{D}_\lambda$ of a TE-COD $\mathbf{X}_{\text{TE-COD}}$ (as represented in (9)) can
be chosen to satisfy the following relation:

$$\mathbf{C}_\lambda \mathbf{C}_\lambda^H + \mathbf{D}_\lambda \mathbf{D}_\lambda^H = \mathbf{I}_T \quad \forall\, \lambda = 1 \text{ to } K. \quad (11)$$

Proof: Consider the column-vector representation of TE-CODs
as given in (9). Since the entries of $\mathbf{X}_{\text{TE-COD}}$ are of the
form $a$, $\pm x_i$ and $\pm x_i^*$ $\forall\, i = 1$ to $k$ and the vector $\tilde{\mathbf{x}}$ is given by
(10), it is straightforward to verify that the matrices $\mathbf{C}_{i_1}$, $\mathbf{D}_{i_2}$,
$i_1, i_2 = 1, 2, \cdots, K$, satisfy the following three properties:

• The entries of the matrices $\mathbf{C}_{i_1}$, $\mathbf{D}_{i_2}$ are $0, \pm 1$.

• The matrices $\mathbf{C}_{i_1}$, $\mathbf{D}_{i_2}$ can have at most one non-zero
entry in every row.

• The two matrices $\mathbf{C}_{i_1}$ and $\mathbf{D}_{i_1}$ do not contain non-zero
entries in the same row.

Note that every complex variable appears exactly once (either
as $\pm x_i$ or $\pm x_i^*$) in every column of the design. Without loss
of generality, let us assume that $l_\lambda$ out of the $k$ complex
variables which appear in the $\lambda$-th column of the design,
$\lambda = 1, 2, \cdots, K$, are of the form $\pm x_i$. Then, the matrix $\mathbf{C}_\lambda$
must have $T - k + l_\lambda$ non-zero rows (where $l_\lambda$ non-zero rows
are for the variables and the remaining non-zero rows are for
the $a$'s). Further, as the remaining $k - l_\lambda$ variables appear as
conjugates (i.e., of the form $\pm x_i^*$), the matrix $\mathbf{D}_\lambda$ must have
$k - l_\lambda$ non-zero rows.

Since there are $T - k$ entries that are $a$ in the vector $\tilde{\mathbf{x}}$, the
non-zero entries in the $T - k$ non-zero rows of $\mathbf{C}_\lambda$ that are allotted
to the $T - k$ copies of $a$ can be chosen to appear in
different columns. Therefore, for every column index, $\mathbf{C}_\lambda$ and $\mathbf{D}_\lambda$
together have exactly one non-zero entry and hence they satisfy the
relation given by (11). ■
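The argument of the proof can be checked numerically on a small case. The sketch below builds 0/±1 column-vector representation matrices for one standard arrangement of the $4 \times 4$ rate-3/4 design (an assumed arrangement, encoded symbolically) and verifies relation (11) for every column. All names are hypothetical; since the matrices are real, the conjugate transpose reduces to the transpose.

```python
# Hypothetical encoding: (sign, variable index, conjugated?) or None where the
# TE-COD carries the training symbol a.
DESIGN = [
    [(+1, 0, False), (+1, 1, False), (+1, 2, False), None],
    [(-1, 1, True), (+1, 0, True), None, (+1, 2, False)],
    [(-1, 2, True), None, (+1, 0, True), (-1, 1, False)],
    [None, (-1, 2, True), (+1, 1, True), (+1, 0, False)],
]
T, K, k = 4, 4, 3


def column_matrices(design, lam, T, k):
    """C, D such that column lam equals C x~ + D conj(x~), x~ = [a..a x_1..x_k]^T."""
    C = [[0] * T for _ in range(T)]
    D = [[0] * T for _ in range(T)]
    next_a_slot = 0
    for t in range(T):
        e = design[t][lam]
        if e is None:                       # entry a: take a fresh a-slot of x~
            C[t][next_a_slot] = 1
            next_a_slot += 1
        else:
            sign, idx, conj = e
            (D if conj else C)[t][T - k + idx] = sign
    return C, D


def row_gram(C, D):
    """C C^T + D D^T for real integer matrices."""
    n = len(C)
    return [[sum(C[r][j] * C[s][j] + D[r][j] * D[s][j] for j in range(n))
             for s in range(n)] for r in range(n)]


identity = [[int(r == s) for s in range(T)] for r in range(T)]
theorem_holds = all(
    row_gram(*column_matrices(DESIGN, lam, T, k)) == identity
    for lam in range(K)
)
```

Assigning each copy of $a$ in a column to a distinct slot of $\tilde{\mathbf{x}}$ (the `next_a_slot` counter) mirrors the "different columns" choice made in the proof.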

A. Distributed Construction of TE-CODs 

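With reference to the distributed space-time coding technique
proposed in Section II, in this section, we describe how
to choose the sets $\{\mathbf{A}_\lambda, \mathbf{B}_\lambda \mid \lambda = 1 \text{ to } K\}$ and $\mathcal{S}$ such that a
$T \times K$ TE-COD $\mathbf{X}_{\text{TE-COD}}$ in $k$ variables can be constructed
as the TE-DSTBC given in (2). Note that every column of
$\mathbf{X}_{\text{TE-COD}}$ contains exactly $k$ distinct variables and $T - k$
copies of $a$.

After each relay performs the concatenation operation specified
in (1), the vector $\tilde{\mathbf{r}}_\lambda$ is given by $\tilde{\mathbf{r}}_\lambda = \sqrt{P_1 N}\, |h_\lambda|\, \tilde{\mathbf{x}} + \tilde{\mathbf{n}}_\lambda$,
where the vector $\tilde{\mathbf{x}}$ is given by (5). Hence, the column-vector
representation matrices $\mathbf{C}_\lambda$ and $\mathbf{D}_\lambda$ of $\mathbf{X}_{\text{TE-COD}}$ are chosen
as the relay matrices $\mathbf{A}_\lambda$ and $\mathbf{B}_\lambda$, respectively. With the above
choice of the sets $\{\mathbf{A}_\lambda, \mathbf{B}_\lambda \mid \lambda = 1 \text{ to } K\}$ and $\mathcal{S}$, a $T \times K$
TE-COD $\mathbf{X}_{\text{TE-COD}}$ in $k$ variables can be constructed as a
TE-DSTBC.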
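The distributed construction can be sketched in Python as follows: every relay applies its own pair of matrices to the same vector $\tilde{\mathbf{x}}$, and stacking the $K$ outputs as columns reproduces the TE-COD. The sketch is noise-free and omits all power scaling; the design arrangement, the names and the sample values are assumptions for illustration only.

```python
# Hypothetical encoding: (sign, variable index, conjugated?) or None for a.
DESIGN = [
    [(+1, 0, False), (+1, 1, False), (+1, 2, False), None],
    [(-1, 1, True), (+1, 0, True), None, (+1, 2, False)],
    [(-1, 2, True), None, (+1, 0, True), (-1, 1, False)],
    [None, (-1, 2, True), (+1, 1, True), (+1, 0, False)],
]
T, K, k = 4, 4, 3


def relay_matrices(design, lam, T, k):
    """Relay pair (A, B): the column-vector representation matrices of column lam."""
    A = [[0] * T for _ in range(T)]
    B = [[0] * T for _ in range(T)]
    next_a_slot = 0
    for t in range(T):
        e = design[t][lam]
        if e is None:
            A[t][next_a_slot] = 1
            next_a_slot += 1
        else:
            sign, idx, conj = e
            (B if conj else A)[t][T - k + idx] = sign
    return A, B


def relay_output(A, B, x_tilde):
    """Noise-free, unscaled relay processing: A x~ + B conj(x~)."""
    n = len(x_tilde)
    return [sum(A[t][j] * x_tilde[j] + B[t][j] * x_tilde[j].conjugate()
                for j in range(n)) for t in range(n)]


a = 0.5
xs = [1 + 2j, -0.5 + 1j, 3 - 1j]
x_tilde = [complex(a)] * (T - k) + xs
columns = [relay_output(*relay_matrices(DESIGN, lam, T, k), x_tilde)
           for lam in range(K)]
X = [[columns[lam][t] for lam in range(K)] for t in range(T)]  # T x K codeword
```

Each column of `X` is produced by a different relay, yet the assembled matrix is exactly the design with its zeros replaced by `a`.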

Example 2: To construct the TE-COD given in Example 1,
the following ingredients are required at the various terminals.
We have $T = K = 4$ and $k = 3$. The set $\mathcal{S}$ is given by
$\mathcal{S} = \{[a \;\; x_1 \;\; x_2 \;\; x_3]^T \mid \forall\, x_i \in \mathcal{M}\}$. The corresponding relay
matrices $\mathbf{A}_\lambda, \mathbf{B}_\lambda$ are the column-vector representation
matrices $\mathbf{C}_\lambda, \mathbf{D}_\lambda$ of the design in (8), for $\lambda = 1, \cdots, 4$.
To implement the above design, the number of channel uses
required in the first phase is 4 (3 channel uses for the variables
and the rest for transmitting $a$). The number of channel uses
in the second phase is also 4. Hence, the rate of this scheme
is 3/8.

B. On the Single-Symbol ML Decodable Property of Distributed
TE-CODs

Note that, excluding the scaling factors and the constant
terms, the ML decoding metric given in (4) is a function of
the following two terms: (i) $\mathbf{g}^H \mathbf{X}^H \mathbf{R}^{-1} \mathbf{y}$ and (ii) $\mathbf{g}^H \mathbf{X}^H \mathbf{R}^{-1} \mathbf{X} \mathbf{g}$,
where $\mathbf{R}$ is a function of the set $\{\mathbf{A}_\lambda, \mathbf{B}_\lambda\}$ as given in (3).
Also, from the results of Theorem 1, for the class of TE-CODs,
$\mathbf{R}$ is a scaled identity matrix and hence the matrix $\mathbf{X}^H \mathbf{R}^{-1} \mathbf{X}$
is the same as $\mathbf{R}^{-1} \mathbf{X}^H \mathbf{X}$. Since $\mathbf{X}$ is a TE-COD, the matrix
$\mathbf{X}^H \mathbf{X}$ can be written as a sum of $k$ matrices where each matrix
is strictly a function of only one of the $k$ variables. For
example, the matrix $\mathbf{X}^H \mathbf{X}$ for the square TE-COD for 4 relays
given in Example 2 is given by (12), in which, since $\mathbf{X}^H \mathbf{X}$ is a
Hermitian matrix, only the elements on and above the main diagonal
are presented. Note that $\mathbf{X}^H \mathbf{X}$ is not diagonal since all the 0's have
been replaced by $a$. Nevertheless, because $\mathbf{X}^H \mathbf{X}$ splits as described
above, the ML decoding metric splits as
a sum of several terms, with each term being a function of
only one of the variables. Thus, when TE-CODs are applied as
TE-DSTBCs, every variable can be decoded independently of the
other complex variables. Notice that when $a = 0$, the matrix
$\mathbf{X}^H \mathbf{X}$ in (12) becomes a scaled identity matrix corresponding
to the well known CODs.
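The splitting of $\mathbf{X}^H \mathbf{X}$ can be illustrated numerically. The Python sketch below uses one standard arrangement of the $4 \times 4$ rate-3/4 design (an assumption, not necessarily the authors' arrangement) and checks that the Gram matrix of the TE-COD equals the sum of three single-variable Gram matrices, corrected for the constant part contributed by $a$ alone. Function names and sample values are hypothetical.

```python
# Hypothetical encoding: (sign, variable index, conjugated?) or None for a.
DESIGN = [
    [(+1, 0, False), (+1, 1, False), (+1, 2, False), None],
    [(-1, 1, True), (+1, 0, True), None, (+1, 2, False)],
    [(-1, 2, True), None, (+1, 0, True), (-1, 1, False)],
    [None, (-1, 2, True), (+1, 1, True), (+1, 0, False)],
]


def te_gram(xs, a):
    """X^H X of the TE-COD evaluated at symbols xs with training symbol a."""
    def entry(e):
        if e is None:
            return complex(a)
        sign, idx, conj = e
        return sign * (xs[idx].conjugate() if conj else xs[idx])
    m = [[entry(e) for e in row] for row in DESIGN]
    n = len(m)
    return [[sum(m[t][i].conjugate() * m[t][j] for t in range(n))
             for j in range(n)] for i in range(n)]


a = 0.5
xs = [1 + 2j, -0.5 + 1j, 3 - 1j]
full = te_gram(xs, a)
parts = [te_gram([xs[0], 0j, 0j], a),
         te_gram([0j, xs[1], 0j], a),
         te_gram([0j, 0j, xs[2]], a)]
const = te_gram([0j, 0j, 0j], a)          # contribution of a alone

# Separability check: full == sum of single-variable parts - 2 * constant part.
max_gap = max(
    abs(full[i][j] - (sum(p[i][j] for p in parts) - 2 * const[i][j]))
    for i in range(4) for j in range(4)
)
off_diagonal = full[0][1]                 # non-zero because a != 0
```

A vanishing `max_gap` together with a non-zero `off_diagonal` shows both statements of the text at once: the Gram matrix is not diagonal, yet it has no term that couples two different variables.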

C. Full Diversity of Distributed TE-CODs 

From the results of [2], a TE-DSTBC is fully diverse if for
any two distinct codewords $\mathbf{X}_1$ and $\mathbf{X}_2$ of the TE-DSTBC, the
matrix $(\mathbf{X}_1 - \mathbf{X}_2)^H (\mathbf{X}_1 - \mathbf{X}_2)$ is full rank. Since we employ
a TE-COD to generate the TE-DSTBC, the difference matrix
$\mathbf{X}_1 - \mathbf{X}_2$ has 0 at every position where there is $a$ in the
design and hence the matrix $(\mathbf{X}_1 - \mathbf{X}_2)^H (\mathbf{X}_1 - \mathbf{X}_2)$ is a
diagonal one with full rank. Thus, TE-DSTBCs generated from
TE-CODs have the full diversity property for arbitrary signal sets.
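A small numerical check of this argument, under the assumption of one standard arrangement of the $4 \times 4$ rate-3/4 design, is sketched below in Python. Two codewords that differ in a single symbol are subtracted; the training entries cancel and the Gram matrix of the difference is a scaled identity. Names and sample values are hypothetical.

```python
# Hypothetical encoding: (sign, variable index, conjugated?) or None for a.
DESIGN = [
    [(+1, 0, False), (+1, 1, False), (+1, 2, False), None],
    [(-1, 1, True), (+1, 0, True), None, (+1, 2, False)],
    [(-1, 2, True), None, (+1, 0, True), (-1, 1, False)],
    [None, (-1, 2, True), (+1, 1, True), (+1, 0, False)],
]


def te_codeword(xs, a):
    """Numeric TE-COD codeword for symbols xs and training symbol a."""
    def entry(e):
        if e is None:
            return complex(a)
        sign, idx, conj = e
        return sign * (xs[idx].conjugate() if conj else xs[idx])
    return [[entry(e) for e in row] for row in DESIGN]


def gram(m):
    """Compute M^H M for a square list-of-lists matrix M."""
    n = len(m)
    return [[sum(m[t][i].conjugate() * m[t][j] for t in range(n))
             for j in range(n)] for i in range(n)]


a = 0.5
xs1 = [1 + 1j, -1 + 1j, 1 - 1j]
xs2 = [1 + 1j, -1 + 1j, -1 - 1j]          # differs from xs1 in the last symbol only
X1, X2 = te_codeword(xs1, a), te_codeword(xs2, a)
diff = [[X1[t][c] - X2[t][c] for c in range(4)] for t in range(4)]
G = gram(diff)
delta_energy = sum(abs(u - v) ** 2 for u, v in zip(xs1, xs2))
```

Even though the two codewords differ in only one symbol, every diagonal entry of `G` equals `delta_energy`, so the difference matrix has full rank.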
D. Rate of Distributed TE-CODs in Symbols per Channel Use

In our proposed scheme, the total number of channel uses
involving both the first phase and the second phase is
$N + T = \lceil (T-k)/2 \rceil + k + T$. Therefore the rate of our scheme in
symbols per channel use is

$$\frac{k}{\lceil (T-k)/2 \rceil + k + T}$$

for both the square and the non-square TE-CODs. The rate for
square TE-CODs with $2^a$ relays follows by substituting the
corresponding values of $k$ and $T$. The
rate for non-square TE-CODs, from maximal rate CODs, for
the cases of $2m$ or $2m - 1$ relays, for which $k/T = (m+1)/(2m)$,
is easily calculated to be

$$\frac{T(m+1)}{2m\left(\lceil (T-k)/2 \rceil + k + T\right)}$$

complex symbols per channel use.

In [7], SSD DSTBCs have been constructed with rate 1/2 in
complex symbols per channel use in the second phase. If the
number of channel uses in the first phase is also considered,
then the rate of such codes will be 1/3 (which doesn't include
the number of channel uses needed for training in the first
phase). In Table I, we list the rates (including the channel uses in both
the phases) of TE-DSTBCs from non-square TE-CODs (which
include the channel uses for training in the first
phase) for different values of $K$. When
compared with the codes in [7] (whose rate doesn't include the
channel uses needed for training in the first phase), it is clear
that for networks with $K < 10$, TE-DSTBCs from non-square
TE-CODs provide higher rates than the codes in [7].
In [7], if the number of channel uses spent on transmitting
the training symbols (from the source to the relays) is also
included in calculating the rate, then the rate of such DSTBCs
will be less than 1/3 and hence non-square TE-CODs provide
higher rate gains than those listed in Table I.
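The rate bookkeeping can be sketched with exact fractions in Python. The sketch assumes that the number of training channel uses in the first phase is $\lceil (T-k)/2 \rceil$, which is our reading of a garbled expression and should be treated as an assumption; it can be overridden through the hypothetical parameter `n_train`. The baseline of 1/3 follows from a second-phase rate of 1/2 with $k$ first-phase channel uses and no training counted.

```python
from fractions import Fraction


def te_rate(T, k, n_train=None):
    """Symbols per channel use over both phases: k / (n_train + k + T)."""
    if n_train is None:
        # ASSUMPTION: ceil((T - k) / 2) training channel uses in the first phase.
        n_train = -((k - T) // 2)
    return Fraction(k, n_train + k + T)


def baseline_rate(k):
    """Second-phase rate 1/2 (T = 2k) plus k first-phase uses, training not counted."""
    return Fraction(k, k + 2 * k)


square_example = te_rate(T=4, k=3)        # the 4 x 4 example with k = 3
beats_baseline = square_example > baseline_rate(4)
```

For the $4 \times 4$ example this gives one training channel use, a first phase of 4 channel uses and an overall rate of 3/8, in agreement with Example 2.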

E. Channel estimation at the destination 

For TE-DSTBCs, we assume one of the following two 
methods of channel estimation at the destination. 

• Since the destination receives a linear combination of 
the information symbols and the training symbols, it can 
possibly estimate all the channel gains using the symbols 
received during the T channel uses in the second phase. 
For a background on channel estimation with superpo- 
sition pilot sequence, we refer the readers to [13]. Note 
that for the DSTBCs proposed in [7], separate training 
symbols are needed (from the relays to the destination) 
for the destination to estimate the channels. As a result, 
the proposed scheme provides a further advantage in the
overall rate (when the number of channel uses in sending
training symbols from the relays to the destination is
also included in the calculation of the rate) compared
to the schemes in [7].

• For TE-DSTBCs, additional training symbols can be 
transmitted from the relays to the destination for chan- 
nel estimation (this is apart from the training symbols 
transmitted along with the information symbols). Since, 
additional training symbols are also needed for the codes 
proposed in [7], the existing rate advantage of our scheme 
over the scheme in [7] still holds. 

IV. Simulation Results

In this section, we provide the performance comparison (in
terms of the bit error rate) between the DSTBC from the TE-COD
(given in Example 2) and the DSTBC proposed in [7]
for $K = 4$. Note that both the codes have the single-real-symbol
ML decodable property for QAM signal sets. Throughout
this section, the designs used in [7] are referred to as "CODs
from RODs". For $K = 4$, the rates (in complex symbols
per channel use in the second phase) of the DSTBCs from
TE-COD and "COD from ROD" are respectively 3/4 and 1/2.
For the two codes, both the number of channel uses and the
energy consumption in the first phase are the same. However,
the number of channel uses and the energy consumption in
the second phase are different for the two codes. Hence, for
a fair comparison, we make the bits per channel use (bpcu)
in the second phase the same for both the codes; in particular,
we make it equal to 1.5 bpcu for the simulation purpose. To
achieve the common rate of 1.5 bpcu in the second phase,
the TE-COD and the "COD from ROD" respectively employ the
4-QAM signal set $\{-1 + i, 1 + i, -1 - i, 1 - i\}$ and the 8-QAM
signal set $\{-3 + i, -1 + i, 1 + i, 3 + i, -3 - i, -1 - i, 1 - i, 3 - i\}$
to construct the DSTBCs. Note that this 8-QAM signal set is
not energy efficient; a more energy efficient 8-point QAM is
$\{-1 + 3i, 3 + 3i, -3 + i, 1 + i, -1 - i, 3 - i, -3 - 3i, 1 - 3i\}$.
However, with the use of the more energy efficient 8-point
QAM, the real-symbol ML decodable property will be lost (the
ML decoder in such a case will be single-complex-symbol
decodable). Hence, we use the former 8-QAM constellation in our
simulations. The BER performance of both codes is plotted
against the energy used per bit in Fig. 2, which shows that the
TE-COD performs better than the "COD from ROD" by 0.5 dB.
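The rate matching and the remark on energy efficiency can be checked with a few lines of Python. The sketch assumes that the last point of the rectangular 8-QAM set is $3 - i$ (the listed set repeats $3 + i$), and it measures energy efficiency as the minimum squared distance divided by the average energy, which is our choice of metric rather than one stated in the text. All names are hypothetical.

```python
import math
from itertools import combinations


def bpcu(rate, constellation):
    """Bits per channel use: code rate times bits per symbol."""
    return rate * math.log2(len(constellation))


def sq_norm(z):
    return z.real ** 2 + z.imag ** 2


def avg_energy(constellation):
    return sum(sq_norm(p) for p in constellation) / len(constellation)


def min_sq_distance(constellation):
    return min(sq_norm(p - q) for p, q in combinations(constellation, 2))


def efficiency(constellation):
    """ASSUMED metric: minimum squared distance per unit average energy."""
    return min_sq_distance(constellation) / avg_energy(constellation)


qam4 = [-1 + 1j, 1 + 1j, -1 - 1j, 1 - 1j]
qam8_rect = [-3 + 1j, -1 + 1j, 1 + 1j, 3 + 1j, -3 - 1j, -1 - 1j, 1 - 1j, 3 - 1j]
qam8_alt = [-1 + 3j, 3 + 3j, -3 + 1j, 1 + 1j, -1 - 1j, 3 - 1j, -3 - 3j, 1 - 3j]

te_cod_bpcu = bpcu(0.75, qam4)        # rate 3/4 with 4-QAM
rod_bpcu = bpcu(0.5, qam8_rect)       # rate 1/2 with 8-QAM
```

Both codes come out at 1.5 bpcu in the second phase, and under the assumed metric the alternative 8-point set scores higher than the rectangular one, consistent with the remark in the text.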



V. Discussion and Conclusions 

In this paper, through a training-based distributed space-time
coding technique, we have shown how to construct variants
of the well known class of CODs in two-hop relay networks
using the amplify-and-forward protocol. The inclusion of training
symbols in the structure of the code has been shown
to provide a high rate as well as the SSD property for the
constructed codes. This idea can be extended to construct
DSTBCs from other SSD designs, like the CIODs and CUWDs existing
for point-to-point co-located MIMO channels, for two-hop
wireless networks [14].

Fig. 2. BER comparison between the DSTBC from TE-COD and the DSTBC
from "CODs from RODs" for K = 4 with 1.5 bpcu.